Notes (Week 8 Thursday)
These are the notes I presented in today's lecture, together with additions and clarifications made afterwards, including further answers to some questions asked in class. Many thanks to Craig McLaughlin and Johannes Åman Pohjola for the original notes.
1. Notes
Most General Type
-----------------

Example: What's the most general type of  recfun f x = fst x + 1  ?

    f : ∀α. α → α           -- No....why not?
    f : ∀α. (Int, α) → Int  -- This one.

Takeaway: Most general = least commitment, most widely applicable.

-> Why can't we have that for recursive types?

Consider these two recursive types:

    List = type a. rec t. () + (a, t)
    Tree = type a. rec t. () + (a, t, t)

    roll(inl ()) : List@Bool ?
    roll(inl ()) : Tree@Int ?

We just don't know.

Takeaway: Values of recursive types will not have a unique most general type.

Note: In Haskell these types are

    data List a = Nil | Cons a (List a)
    data Tree a = EmptyTree | Node a (Tree a) (Tree a)

i.e. Haskell requires unique constructor names, so it doesn't end up with
this ambiguity from just using `roll`.

-> Don't sums have the same problem? Consider also

    inl() : () + Int
    inl() : () + Bool

Unlike the recursive types, these *do* have a unique most general type:

    inl() : () + a

Why Func adds to the type env
-----------------------------

Consider this attempt to infer the type of a recfun if the Func rule
doesn't add f, x to the typing environment Γ:

    __________________    ___________
    Γ ⊢ f : Int -> Int    Γ ⊢ x : Int
    _________________________________    _________
            Γ ⊢ f(x) : Int                1 : Int
            _____________________________________
                    Γ ⊢ f(x) + 1 : Int
    ______________________________________ Func?
    Γ ⊢ Recfun f x = f(x) + 1 : Int -> Int

The problem is that Γ doesn't tell us anything about f and x, which are
local to the recfun. But with f and x added to the environment by Func as
follows, we are then able to prove the upper-left parts of the above
derivation:

                    etc. etc.
    ___________________________________________
    f : Int -> Int, x : Int, Γ ⊢ f(x) + 1 : Int
    ___________________________________________ Func
      Γ ⊢ Recfun f x = f(x) + 1 : Int -> Int

Generalising
------------

    Γ ⊢ e : τ    α ∉ FTV(Γ)
    ─────────────────────── AllI
    Γ ⊢ e : ∀α.τ

The example given on the slide was:

    let f = (recfun f x = (x, x))
    in (fst (f 4), fst (f True))

with the prompt "Where should generalisation happen?"
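The Haskell versions of the two types can be written out directly; because every value names its constructor, the empty list and the empty tree have distinct, unambiguous most general types. A small sketch (the `lenL` helper is mine, just for illustration):

```haskell
-- The two recursive types from above, in Haskell. Constructor names
-- make each value's type unambiguous: Nil can only be a List, and
-- EmptyTree can only be a Tree.
data List a = Nil | Cons a (List a)
data Tree a = EmptyTree | Node a (Tree a) (Tree a)

-- Nil has a unique most general type: List a.
emptyList :: List a
emptyList = Nil

-- An illustrative helper showing List in use.
lenL :: List a -> Int
lenL Nil        = 0
lenL (Cons _ t) = 1 + lenL t
```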
The problem is that this rule is applicable to every possible expression -
we could apply it when e is any of the following expressions, as long as
we pick an unused type variable α ∉ FTV(Γ) to generalise over:

    e = (let f = (recfun f x = (x, x)) in (fst (f 4), fst (f True)))
    e = (recfun f x = (x, x))
    ...

Making the generalisation rule applicable only to let-expressions (or the
top level of the entire program) means there is no ambiguity about when to
apply it, resulting in the kind of deterministic behaviour we want in a
type inference algorithm.

Type Inference Algorithm Preliminaries
--------------------------------------

We'll extend our grammar for typing contexts to include type variable
declarations and definitions, and **markers** (important later):

    typing contexts Γ ::= . | Γ,x:σ | Γ,α | Γ,α:=τ | Γ #

Types now contain type variables, which may be "free" or "bound" just
like term variables in expressions:

    R α -- rigid type variables    -- bound by forall quantifiers
    F α -- flexible type variables -- introduced during type inference

    types τ ::= F α | R α | Bool | Int | τ → τ

Forall-quantified types are called type *schemes*:

    type schemes σ ::= ∀[α].τ

A minimal calculus for now, just lambdas and let-expressions:

    expressions e ::= x | λx.e | e e | let x = e in e

Unification
-----------

Our rules for unification are specified by the judgement:

    Γ₁ ⊢ τ₁ ~ τ₂ ⟹ Γ₂

which is defined such that:

1. Γ₁ and Γ₂ contain the same type variables;

2. Γ₂ is **more informative** than Γ₁, in the sense that declared type
   variables have been given definitions in order for τ₁ ~ τ₂ to hold:
   Γ₁ ⊑ Γ₂

3.
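The slide's example can be run directly in Haskell, where generalisation does happen at the let: f gets the scheme ∀a. a → (a, a) and can then be used at both Int and Bool (the names here are mine):

```haskell
-- Generalisation at the let: f :: a -> (a, a), usable at two types.
example :: (Int, Bool)
example =
  let f x = (x, x)                       -- f generalised here
  in (fst (f (4 :: Int)), fst (f True))  -- used at Int and at Bool
```

If generalisation could happen anywhere, an inference algorithm would face a nondeterministic choice at every subexpression; pinning it to let-bindings removes that choice.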
   The information increase is **minimal** (most general), in the sense
   that it makes the least commitment needed to solve the equation: any
   other solution Γ₁ ⊑ Γ' factors through Γ₁ ⊑ Γ₂.

Worked Unification Example
--------------------------

Example unification problem:

    α, β, ρ:=β, γ ⊢ (α → β) ~ (ρ → (γ → γ)) ⟹ updated context

The idea is to update the context with definitions for the unknowns --
the flexible/unification type variables -- by decomposing the
unification problem. Decomposing the function type, the above becomes
two unification problems:

    1. α, β, ρ:=β, γ ⊢ F α ~ F ρ    ⟹ Γ₁
    2. Γ₁            ⊢ β ~ (γ → γ)  ⟹ Γ₂

for some Γ₁, Γ₂. Notice that the input to problem 2 is the output from
problem 1.

To solve a unification problem, we move through the context from right
to left, looking to simplify either side of the equation by definition:

1.  α, β, ρ:=β, γ ⊢ F α ~ F ρ

    γ does not occur in the problem, so we can skip it! What we'll use
    to take these next steps are "unification" rules whose conclusions
    have the form:

        Γ ⊢ F α ~ F ρ ⟹ Γ'

    For this step, we use rule Skip-Type (see further below).

    ⟹ α, β, ρ:=β ⊢ F α ~ F ρ , γ

    ρ occurs on the right-hand side and has a definition: substitute!

    ⟹ α, β ⊢ F α ~ F β , ρ:=β, γ

    β occurs on the right-hand side: define!

    ⟹ α, β:=α, ρ:=β, γ

    Done.

2.  α, β:=α, ρ:=β, γ ⊢ β ~ (γ → γ)

    γ occurs in the type on the right-hand side: a dependency!
    Dependencies are tracked in a new typing context Δ that sits to the
    right of the original one Γ, separated by a bar symbol, as follows:

        Γ | Δ ⊢ foo ~ goo

    To tackle unification problems of this new form, we use
    "instantiation" rules whose conclusions have that form.

    ⟹ α, β:=α, ρ:=β | γ ⊢ β ~ (γ → γ)

    ρ does not occur: skip!

    ⟹ α, β:=α | γ ⊢ β ~ (γ → γ) , ρ:=β

    β occurs on the left-hand side: substitute its definition!

    ⟹ α | γ ⊢ α ~ (γ → γ) , β:=α, ρ:=β

    α occurs on the left-hand side: define, placing all its dependencies
    (γ) to the left:

    ⟹ γ, α:=γ → γ, β:=α, ρ:=β

    Done.
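The "substitute!" steps above replace a defined variable by its definition inside a type. A minimal Haskell sketch of that one operation, with invented names (`Ty`, `subst`) and ASCII variable names standing in for α, β, γ, ρ:

```haskell
-- Types as in the notes: flexible vars, rigid vars, Int, Bool, arrows.
data Ty = Flex String | Rigid String | TInt | TBool | Arr Ty Ty
  deriving (Eq, Show)

-- subst a t u computes [a := t] u, replacing the flexible variable a.
subst :: String -> Ty -> Ty -> Ty
subst a t (Flex b) | a == b = t
subst a t (Arr u v)         = Arr (subst a t u) (subst a t v)
subst _ _ u                 = u

-- The step "rho has a definition rho := beta, substitute!" turns
-- F alpha ~ F rho into F alpha ~ F beta:
--   subst "p" (Flex "b") (Flex "p")  ==  Flex "b"
```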
Unification rule design
-----------------------

Recall the unification judgement:

    Γ₁ ⊢ τ₁ ~ τ₂ ⟹ Γ₂

We define it by induction on the structure of types:

    types τ ::= F α | R α | Bool | Int | τ → τ

For rigid type variables, the rules are straightforward:

    α = β
    ────────────────── Unify-R-Eq
    Γ ⊢ R α ~ R β ⟹ Γ

and similarly for Int and Bool. Note, α = β means the string identifiers
for type variables α and β are literally identical (and α != β means
they are not). The arrow case is similar, just recursing structurally:

    Γ₁ ⊢ τ₁ ~ ν₁ ⟹ Γ₂    Γ₂ ⊢ τ₂ ~ ν₂ ⟹ Γ₃
    ─────────────────────────────────────── Unify-Arrow
    Γ₁ ⊢ (τ₁ → τ₂) ~ (ν₁ → ν₂) ⟹ Γ₃

The interesting cases are the flex-flex and flex-rigid constraints.

First: unification of a flexible variable with a flexible variable.

    typing contexts Γ ::= . | Γ,x:σ | Γ,α | Γ,α:=τ | Γ #

Intuition: the further right, the more local; the further left, the more
global. Things in the environment Γ can depend on things to their left,
but not on things to their right.

A trivial rule:

    ────────────────── Refl
    Γ ⊢ F α ~ F α ⟹ Γ

What do we do here?

    α != β
    ──────────────────
    Γ ⊢ F α ~ F β ⟹ Γ'

Consider the cases...

    α != β    ρ != α    ρ != β    Γ ⊢ F α ~ F β ⟹ Γ'
    ───────────────────────────────────────────────── Skip-Type
    Γ,ρ ⊢ F α ~ F β ⟹ Γ',ρ

    α != β
    ──────────────────────── Defn
    Γ,α ⊢ F α ~ F β ⟹ Γ,α:=β

(Note, we also have the symmetric version of this Defn rule - that's
what's applied to solve part 1 of the unification example further up.)

    α != β    Γ ⊢ [ρ:=τ](F α) ~ [ρ:=τ](F β) ⟹ Γ'
    ───────────────────────────────────────────── Subst
    Γ,ρ:=τ ⊢ F α ~ F β ⟹ Γ',ρ:=τ

where

    [α:=τ](ν → ν') = ([α:=τ]ν) → ([α:=τ]ν')
    [α:=τ](R x)    = R x
    [α:=τ](F α)    = τ
    [α:=τ](F β)    = F β    (β != α)

    Γ ⊢ F α ~ F β ⟹ Γ'
    ─────────────────────────── Skip-Term
    Γ,x:σ ⊢ F α ~ F β ⟹ Γ',x:σ

Second: unification of a flexible variable with a type that's not a
flexible variable:

    τ not a flexvar    Γ | [] ⊢ F α ~ τ ⟹ Γ'
    ───────────────────────────────────────── Inst
    Γ ⊢ F α ~ τ ⟹ Γ'
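The behaviour of these rules can be approximated in a few lines of Haskell. This sketch uses the more familiar substitution-map formulation rather than the ordered contexts of the notes, so it loses the dependency-ordering bookkeeping, but the Refl, Defn, Subst and arrow cases behave the same way. All names (`Ty`, `Subst`, `unify`, ...) are invented for illustration:

```haskell
import qualified Data.Map as M

-- Types as in the notes; rigid variables unify only with themselves.
data Ty = Flex String | Rigid String | TInt | TBool | Arr Ty Ty
  deriving (Eq, Show)

-- Definitions of flexible variables, standing in for the
-- context's  alpha := tau  entries.
type Subst = M.Map String Ty

-- Chase definitions, like the Subst rules walking through the context.
walk :: Subst -> Ty -> Ty
walk s (Flex a)  = maybe (Flex a) (walk s) (M.lookup a s)
walk s (Arr t u) = Arr (walk s t) (walk s u)
walk _ t         = t

occurs :: String -> Ty -> Bool
occurs a (Flex b)  = a == b
occurs a (Arr t u) = occurs a t || occurs a u
occurs _ _         = False

-- Nothing signals failure (a clash, or an occurs check failure).
unify :: Subst -> Ty -> Ty -> Maybe Subst
unify s t0 u0 = go (walk s t0) (walk s u0)
  where
    go (Flex a) (Flex b) | a == b = Just s     -- Refl
    go (Flex a) v
      | occurs a v = Nothing                   -- occurs check failure
      | otherwise  = Just (M.insert a v s)     -- Defn
    go v (Flex a) = go (Flex a) v              -- symmetric Defn
    go (Rigid a) (Rigid b) | a == b = Just s   -- Unify-R-Eq
    go TInt  TInt  = Just s
    go TBool TBool = Just s
    go (Arr t1 t2) (Arr u1 u2) = do            -- Unify-Arrow
      s' <- unify s t1 u1
      unify s' t2 u2
    go _ _ = Nothing
```

Running it on the worked example (starting from ρ := β, unifying α → β with ρ → (γ → γ)) ends with α standing for γ → γ, agreeing with the context computed above.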
Now we need to apply *instantiation* rules that try to unify the type
variable α with any type τ that is not a flexible type variable. These
rules carry around a new context Δ recording type variable dependencies
of τ that will later need to be put to the left of α.

The following are the instantiation rules. First, the free flexible type
variables of a type:

    FTV(F α)    = {α}
    FTV(τ → τ') = FTV(τ) ∪ FTV(τ')
    ...

    ρ != α    ρ ∈ FTV(τ)    Γ | Δ,ρ ⊢ F α ~ τ ⟹ Γ'
    ─────────────────────────────────────────────── Depend
    Γ,ρ | Δ ⊢ F α ~ τ ⟹ Γ'

Note, the next rule's hypothesis goes back to being a unification
problem:

    Γ,Δ ⊢ [ρ:=τ'](F α) ~ [ρ:=τ']τ ⟹ Γ'
    ──────────────────────────────── Subst
    Γ,ρ:=τ' | Δ ⊢ F α ~ τ ⟹ Γ',ρ:=τ'

This rule is a base case that finally unifies α with a suitable τ. Note
it has an *occurs check*:

    α ∉ FTV(τ)
    ──────────────────────────── Defn
    Γ,α | Δ ⊢ F α ~ τ ⟹ Γ,Δ,α:=τ

What if α ∈ FTV(τ)? Then we have an equation like

    F α ~ (F α → F α)

which is an occurs check failure.

Questions from the class
------------------------

Q: Is the reason we exclude recursive types to avoid cases like this?

    F α ~ (F α → F α)

A: Apart from avoiding the problem I mentioned where `rec t. blah` leads
to ambiguity between types from not having uniquely named constructors,
I agree that avoiding this case is another benefit.

Q: What if we had a type like this?

    MinHS:   rec t. (t -> Int)
    Haskell: data Blah = MyBlah (Blah -> Int)

A: I believe MinHS and Haskell will both allow you to declare types like
this, but then neither of these languages prevents you from having
non-terminating functions. The Agda documentation has a nice example
explaining why Agda requires all self-reference to be in *strictly
positive* position, which means never on the left side of an arrow like
t is above:

    https://agda.readthedocs.io/en/latest/language/data-types.html#strict-positivity

In short, admitting types with self-reference in a negative position can
cause functions on those types to be non-terminating.
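The occurs check at the heart of the Defn instantiation rule is just a membership test on FTV. A tiny Haskell sketch (names invented; ASCII `a` stands in for α):

```haskell
data Ty = Flex String | Rigid String | TInt | TBool | Arr Ty Ty

-- Free flexible type variables, as defined above.
ftv :: Ty -> [String]
ftv (Flex a)  = [a]
ftv (Arr t u) = ftv t ++ ftv u
ftv _         = []

-- The Defn rule fires only when this returns False.
occursIn :: String -> Ty -> Bool
occursIn a t = a `elem` ftv t
```

Here `occursIn "a" (Arr (Flex "a") (Flex "a"))` is True, so the equation F α ~ (F α → F α) is rejected rather than solved with an infinite type.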
Agda and Isabelle (both interactive theorem provers) require all
functions to be terminating. In Isabelle this is to avoid the
possibility of deriving contradictions, and I'd assume that's why Agda
does it too. (See also https://cgi.cse.unsw.edu.au/~cs4161/week05B.pdf
slide 8 - and if you're curious, the point about violating Cantor's
theorem is elaborated in
https://cgi.cse.unsw.edu.au/~cs4161/week05B_demo.thy)

Q: Is the problem with this type that you'd never be able to instantiate
it?

A: I see that as one of the problems - but I think the main reason for
it is that the type doesn't have a base case, only an inductive one. The
following types have a base case (and are therefore instantiable) but
can still lead to non-terminating functions, because their
self-reference is in a negative position:

    MinHS:   rec t. (t -> Int) + 1
    Haskell: data Blah = MyBlah (Blah -> Int) | EmptyBlah
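To make the last point concrete, here is the Haskell version with its base case, plus a function exploiting the negative self-reference: it typechecks, but forcing `omega` would loop forever (the function names are mine):

```haskell
data Blah = MyBlah (Blah -> Int) | EmptyBlah

-- Negative self-reference lets a value be applied to itself.
selfApply :: Blah -> Int
selfApply EmptyBlah  = 0
selfApply (MyBlah f) = f (MyBlah f)

-- Well-typed, but evaluating it never terminates:
--   selfApply (MyBlah selfApply)
--     -> selfApply (MyBlah selfApply) -> ...
omega :: Int
omega = selfApply (MyBlah selfApply)
```

Thanks to laziness, merely defining `omega` is harmless; only demanding its value diverges, which is exactly the behaviour Agda's strict positivity restriction rules out.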